Analysis of partially observed clustered data using generalized estimating equations and multiple imputation.

Authors

  • Kathryn M Aloisio
  • Sonja A Swanson
  • Nadia Micali
  • Alison Field
  • Nicholas J Horton
Abstract

Clustered data arise in many settings, particularly within the social and biomedical sciences. As an example, multiple-source reports are commonly collected in child and adolescent psychiatric epidemiologic studies, where researchers use various informants (e.g., parent and adolescent) to provide a holistic view of a subject's symptomatology. Fitzmaurice et al. (1995) have described estimation of multiple-source models using a standard generalized estimating equation (GEE) framework. However, these studies often have missing data because of the additional stages of consent and assent required. The usual GEE is unbiased when data are Missing Completely at Random (MCAR) in the sense of Little and Rubin (2002). This is a strong assumption that may not be tenable. Other options, such as weighted generalized estimating equations (WEEs), are computationally challenging when missingness is non-monotone. Multiple imputation is an attractive method for fitting incomplete-data models while requiring only the less restrictive Missing at Random (MAR) assumption. Previously, estimation with partially observed clustered data was computationally challenging; however, recent developments in Stata have facilitated the use of these methods in practice. We demonstrate how to use multiple imputation in conjunction with a GEE to investigate the prevalence of disordered eating symptoms in adolescents, as reported by parents and adolescents, as well as factors associated with concordance and prevalence. The methods are motivated by the Avon Longitudinal Study of Parents and their Children (ALSPAC), a cohort study that enrolled more than 14,000 pregnant mothers in 1991-92 and has followed the health and development of their children at regular intervals. Although point estimates were fairly similar to those from the GEE under MCAR, the MAR model had smaller standard errors while requiring less stringent assumptions regarding missingness.
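
The abstract does not spell out how estimates from the imputed datasets are combined. A common choice is Rubin's rules, sketched below in Python as an illustration only; it is not the authors' Stata workflow. The function name `pool_rubin` and the example numbers are hypothetical. The sketch assumes a GEE has already been fitted to each of the m imputed datasets, giving one coefficient estimate and one squared standard error per dataset.

```python
import math


def pool_rubin(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: the average across imputations.
    q_bar = sum(estimates) / m
    # Within-imputation variance: the average of the squared standard errors.
    w_bar = sum(variances) / m
    # Between-imputation variance: sample variance of the point estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance adds a finite-m correction to the between component.
    total = w_bar + (1.0 + 1.0 / m) * b
    return q_bar, math.sqrt(total)


# Hypothetical estimates from three imputed datasets (not from the paper).
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The pooled standard error reflects both the sampling variability within each completed dataset and the extra uncertainty due to the missing values, which is why it depends on how much the estimates differ across imputations.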

Similar resources

A multiple imputation approach to linear regression with clustered censored data.

We extend Wei and Tanner's (1991) multiple imputation approach in semi-parametric linear regression for univariate censored data to clustered censored data. The main idea is to iterate the following two steps: 1) using data augmentation to impute censored failure times; 2) fitting a linear model with the imputed complete data, which takes into account clustering among failure times...

Modeling In Vitro Fertilization Data Considering Multiple Outcomes Observed among Iranian Infertile Women

Objective: Women undergoing IVF cycles must pass successfully through multiple points during the procedure (i.e., implantation, clinical pregnancy, no spontaneous abortion, and delivery) to achieve live births. On the other hand, there is a need to consider previous reproductive outcomes as well as the current cycle. In this study, data on multiple cycles and multiple points during the IVF cycle ar...

Does Type of Pain Predict Pain Severity Changes in Individuals With Multiple Sclerosis? A Longitudinal Analysis Using Generalized Estimating Equations

Background & Objective: Pain is a common symptom among people with MS. In the majority of MS patients, pain is chronic in nature, but it can change over time. The objective of this study was to determine if pain type can predict pain severity changes in individuals with MS over time. Materials & Methods: The research method was a longitudinal design that evaluated pain type and severity at...

Two-step Spline Estimating Equations for Generalized Additive Partially Linear Models with Large Cluster Sizes by Shujie Ma

We propose a two-step estimating procedure for generalized additive partially linear models with clustered data using estimating equations. Our proposed method applies to the case in which the number of observations per cluster is allowed to increase with the number of independent subjects. We establish oracle properties for the two-step estimator of each function component such that it performs as...

Estimating the effect of multiple imputation on incomplete longitudinal data with application to a randomized clinical study.

For analyzing incomplete longitudinal data, there has been recent interest in comparing estimates with and without the use of multiple imputation along with mixed effects model and generalized estimating equations. Empirically, the additional use of multiple imputation generally led to overestimated variances and may yield more heavily biased estimates than the use of last observation carried f...

Journal title:
  • The Stata Journal

Volume 14, Issue 4

Pages: -

Publication date: 2014